Adding uint4 dtype implementation #13

Merged
merged 17 commits into from
Feb 10, 2024

Conversation

@jerryzh168 jerryzh168 commented Nov 23, 2023

Stack from ghstack (oldest at bottom):

Summary:
This PR adds preliminary support for uint4 through a tensor subclass; we'll continue to iterate on it.

We plan to move the uint4 tensor subclass to core after it is more mature.

Test Plan:
python test/dtypes/test_int4.py

Reviewers:

Subscribers:

Tasks:

Tags:

Summary:
There is a lot of interest in int4 dtypes, and we'd like to add the dtype outside of PyTorch core.
This PR adds preliminary support for uint4 through a tensor subclass; we'll continue to iterate on it.

Test Plan:
python test/dtypes/test_int4.py

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
jerryzh168 added a commit that referenced this pull request Nov 23, 2023
ghstack-source-id: fc5acbb
Pull Request resolved: #13
@facebook-github-bot added the CLA Signed label Nov 23, 2023
uint8_data = uint8_data.contiguous().view(-1)
return (uint8_data[::2] << 4 | uint8_data[1::2]).view(down_size(shape))

class UInt4Tensor(torch.Tensor):
Contributor:

We're creating a whole new Tensor here, but really we want to extend a QTensor with a new lower precision backend? Or do we want to combine UInt4Tensor with a QTensor, where QTensor has a UInt4Tensor for storing int4 weights (next to scales etc.)?

Contributor Author:

By QTensor do you mean https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/quantized/QTensorImpl.h? I think we are moving away from that and just relying on PyTorch native dtypes now.

Contributor Author:

@cpuhrsch I guess by QTensor you mean these tensors: https://github.com/pytorch-labs/ao/blob/main/torchao/quantization/subclass.py#L178. This UInt4Tensor can compose with that (it's similar to a uint8 tensor); see the example below (PerChannelSymmetricWeightUInt4Tensor).
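The nibble-packing scheme shown in the diff excerpt above (two uint4 values per byte, with the even-indexed element in the high nibble) can be sketched in plain Python; the helper names are illustrative and torch is deliberately left out for clarity:

```python
def pack_uint4(data):
    # Pack pairs of 4-bit values (each in 0..15) into single bytes:
    # the even-indexed element fills the high nibble, the odd-indexed
    # element the low nibble, mirroring uint8_data[::2] << 4 | uint8_data[1::2].
    assert len(data) % 2 == 0, "need an even number of nibbles"
    return [(hi << 4) | lo for hi, lo in zip(data[::2], data[1::2])]

def unpack_uint4(packed):
    # Inverse: split each byte back into its two nibbles, high nibble first.
    out = []
    for b in packed:
        out.append((b >> 4) & 0xF)
        out.append(b & 0xF)
    return out
```

Round-tripping `pack_uint4` then `unpack_uint4` recovers the original values, which is the invariant the UInt4Tensor storage relies on.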

@bdhirsh (Contributor) commented Nov 29, 2023

The next major issue I think you'll hit is that with torch.compile, we don't have a recipe today for "send the subclass directly to the compiler, instead of desugaring it". Most of the pieces for this to work should already exist, but let me know once the other errors are fixed and we can try to sit down together and get this working.

@jerryzh168 (Contributor Author):

The next major issue I think you'll hit is that with torch.compile, we don't have a recipe today for "send subclass directly to the compiler, instead of desugaring it". Most of the pieces for this to work should already exist, but let me know and once the other errors are fixed we can try to sit down together and get this to work.

Oh, the suggestion from Alban is that we'll desugar it for now, since Triton doesn't have int4 support anyway and we are using handwritten custom kernels; the int4 is just for representation in the frontend.

jerryzh168 added a commit that referenced this pull request Nov 29, 2023
ghstack-source-id: 1a20256
Pull Request resolved: #13
jerryzh168 added a commit that referenced this pull request Nov 29, 2023
ghstack-source-id: d9c853f
Pull Request resolved: #13
jerryzh168 added a commit that referenced this pull request Dec 7, 2023
ghstack-source-id: aa80ed3
Pull Request resolved: #13
jerryzh168 added a commit that referenced this pull request Dec 13, 2023
ghstack-source-id: 6647651
Pull Request resolved: #13
@iseeyuan (Contributor):

Are you planning to add the int4 dtype on the ATen C++ side?

@jerryzh168 (Contributor Author):

Are you planning to add the int4 dtype on the ATen C++ side?

No, we don't. We'll use bits8 instead and encode the int4-related information in the ops that use these tensors, e.g. dequantize_int4(bits8_tensor, scale, zero_point, qmin, qmax, ...)
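As a rough sketch of what such an op could compute (the signature above is hypothetical, and a plain-Python list of bytes stands in for the bits8 tensor here), affine int4 dequantization over packed bytes might look like:

```python
def dequantize_int4(packed_bytes, scale, zero_point, qmin=0, qmax=15):
    # Hypothetical sketch: unpack two 4-bit values from each byte
    # (high nibble first), then apply the usual affine mapping
    # x = (q - zero_point) * scale.
    out = []
    for b in packed_bytes:
        for q in ((b >> 4) & 0xF, b & 0xF):
            q = max(qmin, min(qmax, q))  # defensive clamp to the 4-bit range
            out.append((q - zero_point) * scale)
    return out
```

The point of the design described above is that the storage dtype stays a generic byte type; all int4 semantics live in ops like this one.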

jerryzh168 added a commit that referenced this pull request Dec 18, 2023
ghstack-source-id: 812be7a
Pull Request resolved: #13
jerryzh168 added a commit that referenced this pull request Dec 18, 2023
ghstack-source-id: 05f1c94
Pull Request resolved: #13
jerryzh168 added a commit that referenced this pull request Dec 19, 2023
ghstack-source-id: 9a07ab7
Pull Request resolved: #13
jerryzh168 added a commit to pytorch/pytorch that referenced this pull request Jan 11, 2024
Summary:
These dtypes are added since we see more demand for sub-byte dtypes, especially with
the popularity of LLMs (https://pytorch.org/blog/accelerating-generative-ai-2/#step-4-reducing-the-size-of-the-weights-even-more-with-int4-quantization-and-gptq-2021-toks)

Note these are just placeholders; operator support for these dtypes will be implemented with tensor subclasses.
E.g. torch.empty(..., dtype=torch.uint1) will return a tensor subclass of uint1 that supports operations like bitwise ops, add, mul, etc. (to be added later)

Also note that these are not quantized data types; we'll implement quantization logic with tensor subclasses backed by these dtypes as well.
E.g. `Int4GroupedQuantization(torch.Tensor)` will be implemented with torch.uint4 tensors (see pytorch/ao#13 as an example)

Test Plan:
CIs
python test/test_quantization.py -k test_uint1_7_dtype 

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
jerryzh168 added a commit to pytorch/pytorch that referenced this pull request Jan 11, 2024
jerryzh168 added a commit to pytorch/pytorch that referenced this pull request Jan 12, 2024
jerryzh168 added a commit to pytorch/pytorch that referenced this pull request Jan 12, 2024
jerryzh168 added a commit to pytorch/pytorch that referenced this pull request Jan 12, 2024
ghstack-source-id: 0277ba4
Pull Request resolved: #117208
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Jan 13, 2024
Pull Request resolved: #117208
Approved by: https://github.com/ezyang
jerryzh168 added a commit that referenced this pull request Jan 16, 2024
ghstack-source-id: be18a64
Pull Request resolved: #13
jerryzh168 added a commit that referenced this pull request Jan 17, 2024
ghstack-source-id: 888f289
Pull Request resolved: #13
zero_point: int,
) -> torch.Tensor:
inv_scale = 1.0 / scale
return pack_uint4(torch.clamp(torch.round(input * inv_scale) + zero_point, 0, 15).to(torch.uint8))
Contributor:

For Inductor and export, how do you want these routines to be optimized? Is there a custom kernel for the quantize/dequantize? Or would you prefer for Inductor to "see" this decomposition, so it can try to fuse it into a Triton kernel?

Contributor Author:

I think it will be the latter for now, but maybe in the future we can have custom kernels
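For a concrete picture of the decomposition Inductor would "see", here is the quantize side in plain Python (hypothetical names; the real routine above operates on torch tensors and then packs the result): divide by scale, round, shift by the zero point, and clamp.

```python
def quantize_uint4_values(xs, scale, zero_point):
    # Plain-Python mirror of the routine in the diff:
    # q = clamp(round(x / scale) + zero_point, 0, 15),
    # i.e. clamp to the unsigned 4-bit range before packing.
    out = []
    for x in xs:
        q = int(round(x / scale)) + zero_point
        out.append(max(0, min(15, q)))
    return out
```

Since every step is an elementwise op, leaving the decomposition visible (rather than hiding it behind a custom kernel) gives the compiler a chance to fuse the whole chain.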

self, size = args
size = utils.infer_size(size, self.numel())
assert not kwargs
# WARNING: views not preserved
Contributor:

why aren't we preserving the view?

Contributor Author:

This is copied from Edward's original PR; I'm not exactly sure why. It feels like this is creating a new tensor, so we don't have a view relationship? Is view supposed to share storage?

jerryzh168 added a commit that referenced this pull request Jan 22, 2024
ghstack-source-id: 4b6082c
Pull Request resolved: #13
@jerryzh168 jerryzh168 requested a review from bdhirsh January 22, 2024 18:06
@HDCharles (Contributor) left a comment:

lgtm

jerryzh168 added a commit that referenced this pull request Feb 10, 2024
ghstack-source-id: 51cf717
Pull Request resolved: #13
@jerryzh168 jerryzh168 merged commit 8ab52a7 into gh/jerryzh168/1/base Feb 10, 2024
jerryzh168 added a commit that referenced this pull request Feb 10, 2024
ghstack-source-id: 51cf717
Pull Request resolved: #13
@jerryzh168 jerryzh168 deleted the gh/jerryzh168/1/head branch February 10, 2024 05:50
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024
ghstack-source-id: 51cf717
Pull Request resolved: pytorch#13
Labels
CLA Signed: this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.
10 participants